Systems and methods for managing key-value stores

ABSTRACT

Systems and methods for managing key-value stores are disclosed. In some embodiments, the systems and methods may be realized as a method for managing a key-value store including creating an uncompressed tree of key-value pairs, monitoring the growth of the uncompressed tree, compressing the uncompressed tree when the uncompressed tree meets and/or exceeds a specified threshold, and creating a new empty uncompressed tree.

BACKGROUND

A key-value store is a data organization system where each “value” to be stored is associated with a unique “key.” The basic operations of a key-value store are insert and lookup commands. An insert command is issued to store the key-value pairs on the storage device. A lookup command is issued to retrieve the value associated with a specified key, if the key and value exist. A key-value system must be able to insert and look up key-value pairs in a time-efficient manner. Ideally, key-value pairs are stored and maintained in sorted order by key on the storage device. Algorithmically, inserting and looking up keys within a sorted sequence is substantially faster than performing the same operations on an unsorted sequence. In order to maintain a sequence of key-value pairs in sorted order, it is necessary to reorganize the existing data when new key-value pairs are being inserted.

However, as the size of a key-value store increases, maintaining efficient key lookup and key insertion becomes more challenging.

Caching may be used to increase the performance of a key-value store. Caching is the temporary storage of data for subsequent retrieval. Memory devices are often used to store data provided by a computer program. Examples of memory storage devices include, but are not limited to, solid-state drives, hard disk drives, and optical drives. These types of storage devices are inexpensive and hold large amounts of data. However, one tradeoff for their economic value is that they are slow compared to other components used in a computer. For example, a consumer hard drive can store terabytes of data cheaply, but has a maximum theoretical transfer rate of 300 megabytes (MB) per second. Random access memory (RAM) is faster in performance but higher in price, with a maximum theoretical transfer rate of 12.8 gigabytes (GB) per second. A central processing unit (CPU) with specialized memory known as level 1 (L1) cache or level 2 (L2) cache has even better performance but at an even higher price, with a transfer rate of 16 GB per second, or over fifty times faster than the storage device.

A technique known as caching may be used to increase, or accelerate, overall system performance. Caching may be used to store data requested from one component, into another component, to speed future requests for the same data. The data stored in a cache often may be values previously requested by a software application, by an operating system, or by another hardware component. Caching organizes a small amount of fast-access memory and a large amount of slow-access memory. The first time that a value is requested, the data is not in the cache, so the requested value is retrieved from the slow-access memory. In a cache, when the value is retrieved from the slow-access memory, the value is sent to the component that requested it, and the value also is stored in the fast-access memory for future requests. The next time that the same value is requested by the operating system or by any other program, the value is retrieved from the fast-access memory, with the result that the overall system performance is faster, or accelerated, by virtue of the value being available from the fast-access memory. By using faster memory components to cache data, more requests can be served from the cache instead of the slower storage device, and faster overall system performance is realized.
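The request flow described above can be sketched as follows. This is a minimal illustration only: the class name ReadThroughCache and the use of Python dictionaries to stand in for the fast-access and slow-access memories are assumptions made for the example, not part of the disclosure.

```python
class ReadThroughCache:
    """Small fast-access store placed in front of a large slow-access store."""

    def __init__(self, slow_store):
        self.slow_store = slow_store  # stand-in for the slow-access memory
        self.fast_store = {}          # stand-in for the fast-access memory
        self.hits = 0
        self.misses = 0

    def get(self, key):
        # Subsequent requests are served from the fast-access memory.
        if key in self.fast_store:
            self.hits += 1
            return self.fast_store[key]
        # First request: fetch from the slow-access memory and keep a copy.
        self.misses += 1
        value = self.slow_store[key]
        self.fast_store[key] = value
        return value
```

In this sketch the first request for a key counts as a miss and every later request for the same key counts as a hit, which mirrors the acceleration effect described above.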

Key-value stores may be used to support fast retrieval and insertion of values to support numerous applications. Key-value stores may be implemented on numerous storage platforms which may or may not include cached platforms.

SUMMARY OF THE DISCLOSURE

Systems and methods for managing key-value stores are disclosed. In some embodiments, the systems and methods may be realized as a method for managing a key-value store including creating an uncompressed tree of key-value pairs, monitoring the growth of the uncompressed tree, compressing the uncompressed tree when the uncompressed tree meets and/or exceeds a specified threshold, and creating a new empty uncompressed tree.

Compression of an uncompressed tree may occur in response to one or more factors. For example, an uncompressed tree may be compressed based on a percentage of used space in an uncompressed tree, a number of keys in an uncompressed tree, an amount of storage used by an uncompressed tree, a percentage of storage used by an uncompressed tree, a percentage of memory used by an uncompressed tree, an amount of memory used by an uncompressed tree, a percentage of cache used by an uncompressed tree, an amount of cache used by an uncompressed tree, a key retrieval performance metric, a key insertion performance metric, a key deletion performance metric, a range query performance metric, a current load on a processor, a current I/O load on a cache, a current disk I/O load, a current network I/O load, or other metrics specified by a user. Compression may be performed using one or more algorithms.

In accordance with further aspects of this exemplary embodiment, the systems and methods for managing a key-value store may include maintaining a list of an uncompressed tree and one or more compressed trees and searching one or more trees in the list to identify a key. In accordance with additional aspects of this exemplary embodiment, the systems and methods for managing a key-value store may include receiving a request to insert new key-value data, searching one or more key-value trees and determining whether the key exists in an existing tree. In the event that the key is identified in a tree, the data at an identified offset associated with the identified key may be updated with the new value. In the event that the key is not identified in a tree, a new key-value pair may be inserted in an uncompressed tree.

In accordance with additional aspects of this exemplary embodiment, the systems and methods may further include receiving a request to delete a key-value pair from a tree. The key-value pair may be located via a search of one or more trees. In the event that the key-value pair is located in an uncompressed tree, the key-value pair may be deleted. In the event that the key-value pair is located in a compressed tree, the key-value pair may be marked for deletion. In accordance with further aspects of this exemplary embodiment, an amount, a percentage, or another measure of deleted keys in a compressed tree may be measured. If a percentage of a compressed tree used by deleted keys meets or exceeds a specified threshold, reclamation of the space holding the deleted keys may be performed. Key-value pairs which have not been deleted but which are in a compressed tree holding more than a specified percentage of deleted keys may be copied to an uncompressed tree. The compressed tree may then be deleted. Reclamation of unused space and deletion of keys may be initiated by one or more events. A delete request may trigger an evaluation of space allocated to deleted keys in a particular tree, an evaluation of available storage, an evaluation of an amount of tree and/or key-value data loaded into memory, an evaluation of available memory, an evaluation of an amount of tree and/or key-value data loaded into cache, a comparison of a current processing and/or cache load versus an estimated reclamation time or load, and/or an evaluation of one or more performance metrics. Deletion and/or reclamation of space may also be performed at other times (e.g., periodic intervals such as nightly or during off-peak hours, or in response to an administrator command).

Deletion and/or reclamation of space may occur in response to one or more factors. For example, space in a compressed tree having keys marked as deleted may be reclaimed based on a percentage of unused space in a compressed tree, a number of deleted keys in a compressed tree, an amount of storage used by a compressed tree, a percentage of storage used by a compressed tree, an amount of memory used by a compressed tree, a percentage of memory used by a compressed tree, a percentage of cache used by a compressed tree, an amount of cache used by a compressed tree, a key retrieval performance metric, a key deletion performance metric, a range query performance metric, a current load on a processor, a current network I/O load, a current amount of available memory, a current cache I/O load, a current disk I/O load, or other metrics specified by a user.

In accordance with additional aspects of this exemplary embodiment, the systems and methods may further include receiving a range query. The range query may be processed to retrieve one or more keys in a specified range by traversing a tree structure to identify the one or more keys between two specified keys marking the boundaries of the range.

In accordance with one or more embodiments, the trees used to hold key-value pairs may comprise prefix trees, tries, and/or an ordered tree data structure. Maintaining a number of compressed tries with short keys may facilitate fast retrieval. Maintaining a single uncompressed tree may facilitate fast insertion. Periodically compressing an uncompressed tree based on one or more metrics and/or thresholds may prevent unchecked growth of an uncompressed tree and degradation of performance. Marking keys in a compressed tree as deleted without requiring a rebuilding of the compressed tree may avoid a performance hit. Space may be reclaimed as discussed above during off-peak hours by migrating keys which have not been deleted to an uncompressed tree and deleting the compressed tree.

According to some embodiments, a list or other data structure of one or more data trees may be maintained. For example, a linked list may refer to a plurality of data trees. A first item in a linked list may be a reference to an uncompressed tree. Other items in the linked list may include one or more references to compressed trees. References to trees in a list may be inserted as trees are created. References to trees may be reordered according to one or more factors. For example, metrics may be maintained of trees containing the most frequently requested key-value pairs, least frequently requested key-value pairs, most frequently updated key-value pairs, trees having a highest percentage of deleted keys, trees having a lowest percentage of deleted keys, or other factors. A tree having produced a higher number of hits or a higher percentage of hits to misses than another tree may be listed earlier in a list or data structure, which may cause it to be traversed earlier in a search for a key.

According to some embodiments, a list, linked list, or other data structure containing references to trees may be maintained in memory of a server or host device (e.g., in DRAM). A list, linked list, or other data structure may reference trees by an offset indicating a location in storage or a file of the tree. For example, a linked list in memory may contain an offset indicating a starting position of a tree. The offset may be a location in a file. The file may be located on disk, in a cache device (e.g., an SSD), or other electronic storage. According to some embodiments, a list, a linked list, or other data structure containing references to trees may refer to trees on multiple storage devices. For example, a linked list may refer to a plurality of trees on a first SSD, a plurality of trees on a second SSD, a plurality of trees across several SSDs associated with a host, one or more trees on SSDs and one or more trees on disk, or other arrangements of cache devices and storage. In some embodiments, one or more trees may be loaded in part or entirely into memory.

In accordance with further aspects of this exemplary embodiment, the host device may comprise at least one of: an enterprise server, a database server, a workstation, a computer, a mobile phone, a game device, a personal digital assistant (PDA), an email/text messaging device, a digital camera, a digital media (e.g., MP3) player, a GPS navigation device, and a TV system.

The present disclosure will now be described in more detail with reference to exemplary embodiments thereof as shown in the accompanying drawings. While the present disclosure is described below with reference to exemplary embodiments, it should be understood that the present disclosure is not limited thereto. Those of ordinary skill in the art having access to the teachings herein will recognize additional implementations, modifications, and embodiments, as well as other fields of use, which are within the scope of the present disclosure as described herein, and with respect to which the present disclosure may be of significant utility.

BRIEF DESCRIPTION OF THE DRAWINGS

In order to facilitate a fuller understanding of the present disclosure, reference is now made to the accompanying drawings, in which like elements are referenced with like numerals. These drawings should not be construed as limiting the present disclosure, but are intended to be exemplary only.

FIG. 1 shows an exemplary block diagram depicting a solid state device in communication with a host device, in accordance with an embodiment of the present disclosure.

FIG. 2 depicts an exemplary module for managing key-value stores, in accordance with an embodiment of the present disclosure.

FIG. 3 depicts a flowchart illustrating key-value stores during tree creation, in accordance with an embodiment of the present disclosure.

FIG. 4 depicts a flowchart illustrating key-value stores during key insertion, in accordance with an embodiment of the present disclosure.

FIG. 5 depicts a flowchart illustrating reclamation of key-value storage, in accordance with an embodiment of the present disclosure.

FIG. 6 shows an exemplary diagram of a plurality of key-value stores, in accordance with an embodiment of the present disclosure.

DESCRIPTION

The present disclosure relates to management of key-value stores. Management of key-value stores may include creating an uncompressed tree of key-value pairs, monitoring the growth of the uncompressed tree, compressing the uncompressed tree when the uncompressed tree meets and/or exceeds a specified threshold, and creating a new empty uncompressed tree.

Managing a key-value store may include maintaining a list of an uncompressed tree and one or more compressed trees and searching one or more trees in the list to identify a key. The systems and methods for managing a key-value store may include receiving a request to insert new key-value data, searching one or more key-value trees and determining whether the key exists in an existing tree. In the event that the key is identified in a tree, the data at an identified offset may be updated with the new value. In the event that the key is not identified in a tree, a new key-value pair may be inserted in an uncompressed tree. Key-value store management systems and methods are discussed in further detail below.

Turning now to the drawings, FIG. 1 is an exemplary block diagram depicting a solid state device in communication with a host device, in accordance with an embodiment of the present disclosure. FIG. 1 includes a number of computing technologies such as a host 10, application 20, caching software 30, a bus 50, a cache device 60, a memory device controller 70, a memory 80, an Error Correcting Code (ECC) memory block 90, and a host interface 100. The bus 50 may use a suitable interface standard including, but not limited to, Serial Advanced Technology Attachment (SATA), Advanced Technology Attachment (ATA), Small Computer System Interface (SCSI), PCI-extended (PCI-X), Fibre Channel, Serial Attached SCSI (SAS), Secure Digital (SD), Embedded Multi-Media Card (EMMC), Universal Flash Storage (UFS) and Peripheral Component Interconnect Express (PCIe).

As used herein, the phrase “in communication with” means in direct communication with or in indirect communication with via one or more components named or unnamed herein (e.g., a memory card reader). The host 10 and the cache device 60 can be in communication with each other via a wired or wireless connection and may be local to or remote from one another. According to some embodiments, the cache device 60 can include pins (or a socket) to mate with a corresponding socket (or pins) on the host 10 to establish an electrical and physical connection. According to one or more other embodiments, the cache device 60 includes a wireless transceiver to place the host 10 and cache device 60 in wireless communication with each other.

The host 10 can take any suitable form, such as, but not limited to, an enterprise server, a database host, a workstation, a personal computer, a mobile phone, a game device, a personal digital assistant (PDA), an email/text messaging device, a digital camera, a digital media (e.g., MP3) player, a GPS navigation device, and a TV system. The cache device 60 can also take any suitable form, such as, but not limited to, a universal serial bus (USB) device, a memory card (e.g., an SD card), a hard disk drive (HDD), a solid state device (SSD), and a redundant array of independent disks (RAID). Also, instead of the host device 10 and the cache device 60 being separately housed from each other, such as when the host 10 is an enterprise server and the cache device 60 is an external card, the host 10 and the cache device 60 can be contained in the same housing, such as when the host 10 is a notebook computer and the cache device 60 is a hard disk drive (HDD) or solid-state device (SSD) internal to the housing of the computer.

As shown in FIG. 1, the host 10 can include application 20 and caching software 30. According to some embodiments, a key-value store may be implemented on host 10 (e.g., in memory and/or on storage 40) and caching software 30 and cache device 60 may not be present or utilized. In one or more embodiments, caching software 30 and/or cache device 60 may be utilized to improve performance of a key-value store (e.g., by caching frequently requested data referenced by a key-value store). In general, application 20 may reside on host 10 or remote from host 10. Application 20 may request data from caching software 30. Caching software 30 may be configured to send input/output (I/O) requests to cache device 60 via the bus 50. Caching software 30 may direct input/output (I/O) requests for data not stored on cache device 60 to storage 40. The memory 80 stores data for use by the host 10. The cache device 60 contains a host interface 100, which is configured to receive commands from and send acknowledgments to the host 10 using the interface standard appropriate for the bus 50. The cache device 60 may also contain a memory device controller 70 operative to control various operations of the cache device 60, an optional Error Correcting Code (ECC) memory block 90 to perform ECC operations, and the memory 80 itself.

The memory 80 can take any suitable form, such as, but not limited to, a solid-state memory (e.g., flash memory, or solid state device (SSD)), optical memory, and magnetic memory. While the memory 80 is preferably non-volatile, a volatile memory also can be used. Also, the memory 80 can be one-time programmable, few-time programmable, or many-time programmable. In one embodiment, the memory 80 takes the form of a raw NAND die; however, a raw NOR die or other form of solid state memory can be used.

The host 10 and the cache device 60 can include additional components, which are not shown in FIG. 1 to simplify the drawing. Also, in some embodiments, not all of the components shown are present. Further, the various controllers, blocks, and interfaces can be implemented in any suitable fashion. For example, a controller can take the form of one or more of a microprocessor or processor and a computer-readable medium that stores computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application specific integrated circuit (ASIC), a programmable logic controller, and an embedded microcontroller, for example.

Storage 40 may utilize a redundant array of inexpensive disks (“RAID”), magnetic tape, disk, direct-attached storage (“DAS”), a storage area network (“SAN”), an internet small computer systems interface (“iSCSI”) SAN, a Fibre Channel SAN, a Common Internet File System (“CIFS”), network attached storage (“NAS”), a network file system (“NFS”), or other computer accessible storage.

Caching software 30 may receive I/O requests from application 20 and may forward requests for cached data to cache device 60 and may return retrieved data. Caching software 30 may forward I/O requests for uncached data to storage 40 and may return retrieved data. According to some embodiments, caching software 30 may implement one or more caching algorithms to improve cache performance. For example, caching software 30 may implement one or more of a Least Recently Used (LRU) algorithm, a Least Frequently Used (LFU) algorithm, a Most Recently Used algorithm, or another caching algorithm.
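As one illustration of the caching algorithms named above, a Least Recently Used policy can be sketched with an ordered dictionary. The class name LRUCache and its capacity parameter are assumptions made for this example; the disclosure does not specify how caching software 30 implements any particular policy.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest entry first, newest entry last

    def get(self, key):
        if key not in self.entries:
            return None
        # A successful read makes the entry the most recently used.
        self.entries.move_to_end(key)
        return self.entries[key]

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        # Evict from the front (least recently used) when over capacity.
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)
```

A Least Frequently Used or Most Recently Used policy would differ only in which entry is chosen for eviction.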

According to some embodiments, caching software 30, another application or component of host 10, or an application or component of cache device 60 may implement one or more key-value store management systems and methods. According to one or more embodiments, a linked list, a list, or another data structure referring to one or more trees of key-value pairs may be contained in memory of host 10 (e.g., SDRAM (Synchronous Dynamic Random Access Memory)). The trees used to hold key-value pairs may comprise prefix trees, tries, and/or an ordered tree data structure. Trees may offer advantages in the speed of queries, range-queries, inserts, deletions, and other operations. The list or other data structure may contain offsets referring to an uncompressed tree and one or more compressed trees stored in memory 80 of cache device 60. An uncompressed tree may be used for insertions, which may provide performance advantages. One or more compressed trees may be used to store a number of key-value pairs and may provide query performance advantages. Trees may be comprised of a root node, one or more internal nodes, and one or more leaf nodes. A node may be a record which may be linked to one or more additional records (nodes). The nodes may contain a key and a value. As discussed in further detail in reference to FIGS. 2-6 below, caching software 30 or another component may provide functionality to manage one or more trees of key-value stores.
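The idea of an in-memory list holding offsets to trees kept elsewhere can be sketched as follows. This is an assumption-laden illustration: the helper names append_tree and load_tree, the length-prefixed JSON layout, and the use of an in-memory byte buffer in place of memory 80 or a file are all hypothetical and are not taken from the disclosure.

```python
import io
import json


def append_tree(storage, pairs):
    """Write one serialized tree at the end of storage and return its offset."""
    storage.seek(0, io.SEEK_END)
    offset = storage.tell()
    payload = json.dumps(pairs, sort_keys=True).encode("utf-8")
    storage.write(len(payload).to_bytes(4, "big"))  # 4-byte length prefix
    storage.write(payload)
    return offset


def load_tree(storage, offset):
    """Read back the tree whose record starts at the given offset."""
    storage.seek(offset)
    length = int.from_bytes(storage.read(4), "big")
    return json.loads(storage.read(length).decode("utf-8"))
```

A list such as [append_tree(storage, tree) for tree in trees] would then play the role of the in-memory list of offsets, with the first entry conventionally referring to the uncompressed tree.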

FIG. 2 depicts an exemplary module for managing key-value stores, in accordance with an embodiment of the present disclosure. As illustrated in FIG. 2, key-value management module 210 may contain key-value tree creation module 212, key retrieval module 214, key-value tree compression module 216, and error logging and reporting module 218. One or more modules of FIG. 2 may be implemented on host 10 of FIG. 1, on cache device 60 of FIG. 1, on a combination of host 10 and cache device 60, or on another device (e.g., a separate server, storage, or cache device communicatively coupled to host 10).

Key-value tree creation module 212 may create one or more key-value trees. Key-value tree creation module 212 may create an uncompressed tree. The trees used to hold key-value pairs may comprise prefix trees, tries, and/or an ordered tree data structure. Key-value tree creation module 212 may insert new key-value data. In the event that the key is identified in a tree, the data at an identified offset may be updated with the new value. In the event that the key is not identified in a tree, a new key-value pair may be inserted in an uncompressed tree.

Key-value tree creation module 212 may create and maintain a list of an uncompressed tree and one or more compressed trees. The list or other data structure of one or more data trees may be, for example, a linked list that refers to a plurality of key-value trees. A first item in a linked list may be a reference to an uncompressed tree. Other items in the linked list may include one or more references to compressed trees. References to trees in a list may be inserted as trees are created. Key-value tree creation module 212 may reorder references to key-value trees according to one or more factors. For example, metrics may be maintained of trees containing the most frequently requested key-value pairs, least frequently requested key-value pairs, most frequently updated key-value pairs, trees having a highest percentage of deleted keys, trees having a lowest percentage of deleted keys, or other factors. A tree having produced a higher number of hits or a higher percentage of hits to misses than another tree may be listed earlier in a list or data structure, which may cause that tree to be traversed earlier in a search for a key.
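One way the reordering described above might look is sketched below. The TreeRef record, its fields, and the choice of hit ratio as the ordering metric are assumptions made for illustration; a plain Python list stands in for the linked list, and the uncompressed tree is kept first as described.

```python
class TreeRef:
    """Hypothetical list entry: a reference to one tree plus lookup metrics."""

    def __init__(self, name, offset, compressed):
        self.name = name
        self.offset = offset          # starting position of the tree in storage
        self.compressed = compressed
        self.hits = 0
        self.misses = 0

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def reorder(tree_refs):
    """Keep uncompressed trees first, then compressed trees by hit ratio."""
    uncompressed = [ref for ref in tree_refs if not ref.compressed]
    compressed = sorted(
        (ref for ref in tree_refs if ref.compressed),
        key=lambda ref: ref.hit_ratio(),
        reverse=True,
    )
    return uncompressed + compressed
```

Other factors listed above, such as the percentage of deleted keys, could be substituted for the sort key without changing the structure of the sketch.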

Key retrieval module 214 may search one or more trees in a list to identify a key. Key retrieval module 214 may start with an ordered list, a linked list, or another data structure containing a reference to one or more key-value trees. Key retrieval module 214 may traverse a list to obtain an offset of each tree in order. Key retrieval module 214 may traverse a tree to identify a key-value pair based on the provided key.

Key retrieval module 214 may receive and search using a range query. The range query may be processed to retrieve one or more keys in a specified range by traversing a tree structure to identify the one or more keys in one or more trees between two specified keys marking the boundaries of the range.
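A range query over a single uncompressed trie can be sketched as an ordered traversal that stops once the upper boundary is passed. The names TrieNode, trie_insert, and trie_range are hypothetical, keys are assumed to be strings compared lexicographically, and both boundary keys are treated as inclusive; none of these choices is dictated by the disclosure.

```python
class TrieNode:
    def __init__(self):
        self.children = {}      # maps one character to a child node
        self.has_value = False
        self.value = None


def trie_insert(root, key, value):
    node = root
    for ch in key:
        node = node.children.setdefault(ch, TrieNode())
    node.has_value = True
    node.value = value


def trie_range(root, low, high):
    """Return (key, value) pairs with low <= key <= high, in key order."""
    results = []

    def walk(node, prefix):
        if node.has_value and low <= prefix <= high:
            results.append((prefix, node.value))
        for ch in sorted(node.children):
            child_prefix = prefix + ch
            # Every key below child_prefix is >= child_prefix, so once the
            # prefix exceeds the upper bound the remaining siblings can be skipped.
            if child_prefix > high:
                break
            walk(node.children[ch], child_prefix)

    walk(root, "")
    return results
```

To cover one or more trees as described above, the same traversal could be run on each tree in the list and the per-tree results merged.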

Key-value tree compression module 216 may monitor growth of an uncompressed tree. Key-value tree compression module 216 may compress an uncompressed tree when the uncompressed tree meets and/or exceeds a specified threshold, and may create a new empty uncompressed tree. Compression may affect an amount of data stored in a leaf node of a tree.

Compression of an uncompressed tree may occur in response to one or more factors. For example, an uncompressed tree may be compressed based on a percentage of used space in an uncompressed tree, a number of keys in an uncompressed tree, an amount of storage used by an uncompressed tree, a percentage of storage used by an uncompressed tree, a percentage of memory used by an uncompressed tree, an amount of memory used by an uncompressed tree, a percentage of cache used by an uncompressed tree, an amount of cache used by an uncompressed tree, a key retrieval performance metric, a key insertion performance metric, a key deletion performance metric, a range query performance metric, a current load on a processor, a current I/O load on a cache, a current disk I/O load, a current network I/O load, or other metrics specified by a user.

Key-value tree compression module 216 may further handle deletion of keys. Key-value tree compression module 216 may receive a request to delete a key-value pair from a tree. The key-value pair may be located via a search of one or more trees. In the event that the key-value pair is located in an uncompressed tree, the key-value pair may be deleted. In the event that the key-value pair is located in a compressed tree, the key-value pair may be marked for deletion. An amount, a percentage, or another measure of deleted keys in a compressed tree may be measured by key-value tree compression module 216. If a percentage of a compressed tree used by deleted keys meets or exceeds a specified threshold (e.g., fifty percent), reclamation of the space holding the deleted keys may be performed by key-value tree compression module 216. Key-value pairs which have not been deleted but which are in a compressed tree holding more than a specified percentage of deleted keys may be copied to an uncompressed tree. The compressed tree may then be deleted.
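The deletion and reclamation behavior described above can be sketched as follows. For brevity, a dictionary stands in for the uncompressed tree and a small CompressedTree class (a hypothetical name) stands in for a compressed tree with a set of keys marked as deleted; the fifty percent threshold is the example value given above, measured here by key count.

```python
class CompressedTree:
    """Stand-in for a compressed tree that supports marking keys as deleted."""

    def __init__(self, pairs):
        self.pairs = dict(pairs)
        self.deleted = set()  # keys marked for deletion, space not yet reclaimed

    def deleted_fraction(self):
        return len(self.deleted) / len(self.pairs) if self.pairs else 0.0


def delete_key(key, uncompressed, compressed_trees, threshold=0.5):
    """Delete from the uncompressed tree, or mark as deleted in a compressed tree."""
    if key in uncompressed:
        del uncompressed[key]
        return True
    for tree in list(compressed_trees):
        if key in tree.pairs and key not in tree.deleted:
            tree.deleted.add(key)
            if tree.deleted_fraction() >= threshold:
                # Reclaim: copy surviving pairs out, then drop the whole tree.
                for k, v in tree.pairs.items():
                    if k not in tree.deleted:
                        uncompressed[k] = v
                compressed_trees.remove(tree)
            return True
    return False
```

In this sketch reclamation runs immediately inside the delete request; deferring it to off-peak hours or an administrator command would only change where the threshold check is invoked.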

Reclamation of unused space and deletion of keys may be initiated by one or more events. A delete request may trigger an evaluation of space allocated to deleted keys in a particular tree, an evaluation of available storage, an evaluation of an amount of tree and/or key-value data loaded into memory, an evaluation of available memory, an evaluation of available cache space, a comparison of a current processing and/or cache load versus an estimated reclamation time or load, and/or an evaluation of one or more performance metrics. Deletion and/or reclamation of space may also be performed at other times (e.g., periodic intervals such as nightly or during off-peak hours, or in response to an administrator command).

Error logging and reporting module 218 may handle errors occurring during the management of key-value stores. For example, error logging and reporting module 218 may handle collisions, corruption of keys, corruption of data trees, or other errors.

FIG. 3 depicts a flowchart illustrating key-value stores during tree creation, in accordance with an embodiment of the present disclosure. The process 300, however, is exemplary. The process 300 can be altered, e.g., by having stages added, changed, removed, or rearranged. One or more of the stages may be implemented on host 10 and/or cache device 60 of FIG. 1. One or more stages may be implemented as modules as described in reference to FIG. 2 above. The method 300 may begin at stage 302.

At stage 304, an uncompressed tree of key-value pairs may be created. In accordance with one or more embodiments, the trees used to hold key-value pairs may comprise prefix trees, tries, and/or an ordered tree data structure.

At stage 306, a node may be inserted into the uncompressed key-value tree. The node may hold up to a specified amount of data. According to some embodiments, in an uncompressed tree a key may be approximately 10 bytes and a node (including the key) may be approximately 20 bytes. A size of a key and/or a node may depend upon an amount of memory, electronic storage, an expected tree size, an expected number of trees, or other factors. A node size and/or key size of a compressed tree may be significantly smaller.

The size or other characteristics of an uncompressed tree may be monitored at stage 308. If the size of an uncompressed tree meets and/or exceeds a specified threshold, the method 300 may continue at stage 310. If the size of an uncompressed tree is less than a specified threshold, the method 300 may return to insert another node at stage 306 or may end at stage 316 (not shown).

At stage 310, the uncompressed tree may be compressed. Compression of an uncompressed tree may occur in response to one or more factors. For example, an uncompressed tree may be compressed based on a percentage of used space in an uncompressed tree, a number of keys in an uncompressed tree, an amount of storage used by an uncompressed tree, a percentage of storage used by an uncompressed tree, an amount of memory used by an uncompressed tree, a percentage of memory used by an uncompressed tree, a percentage of cache used by an uncompressed tree, an amount of cache used by an uncompressed tree, a key retrieval performance metric, a key insertion performance metric, a key deletion performance metric, a range query performance metric, a current load on a processor, a current I/O load on a cache, a current disk I/O load, a current network I/O load, or other metrics specified by a user.

At stage 312, a new uncompressed tree may be created. According to some embodiments, uncompressed trees may be used to handle insert requests of new keys. When an uncompressed tree reaches a specified size, it may be compressed as discussed above in reference to stage 310. An uncompressed tree and one or more compressed trees may be associated by a data structure such as a linked list.
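Stages 306 through 312 can be sketched as follows, under several stated assumptions: the threshold is a number of keys (only one of the factors listed above), a dictionary stands in for the uncompressed tree, and "compression" is represented by freezing the pairs into sorted, read-only tuples. The disclosure does not specify a compression algorithm, and the class name KeyValueStore and its members are hypothetical.

```python
import bisect


class KeyValueStore:
    def __init__(self, max_uncompressed_keys=4):
        self.max_uncompressed_keys = max_uncompressed_keys
        self.uncompressed = {}   # stand-in for the uncompressed tree (stage 304)
        self.compressed = []     # list of (sorted keys, values) tuples

    def insert_new(self, key, value):
        self.uncompressed[key] = value                              # stage 306
        if len(self.uncompressed) >= self.max_uncompressed_keys:    # stage 308
            self._compress()                                        # stage 310

    def _compress(self):
        items = sorted(self.uncompressed.items())
        keys = tuple(k for k, _ in items)
        values = tuple(v for _, v in items)
        self.compressed.append((keys, values))
        self.uncompressed = {}                                      # stage 312

    def lookup(self, key):
        if key in self.uncompressed:
            return self.uncompressed[key]
        for keys, values in self.compressed:
            i = bisect.bisect_left(keys, key)  # binary search in a frozen tree
            if i < len(keys) and keys[i] == key:
                return values[i]
        return None
```

The uncompressed structure stays small and cheap to insert into, while each frozen structure is read-only and searched by binary search, which loosely mirrors the insertion and retrieval roles described above.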

At stage 314, a list or other data structure of one or more data trees may be maintained. For example, a linked list may refer to a plurality of data trees. A first item in a linked list may be a reference to an uncompressed tree. Other items in the linked list may include one or more references to compressed trees. References to trees in a list may be inserted as trees are created. References to trees may be reordered according to one or more factors. For example, metrics may be maintained of trees containing the most frequently requested key-value pairs, least frequently requested key-value pairs, most frequently updated key-value pairs, trees having a highest percentage of deleted keys, trees having a lowest percentage of deleted keys, or other factors. A tree having produced a higher number of hits or higher percentage of hits to misses than another tree may be listed earlier in a list or data structure, which may cause it to be traversed earlier in a search for a key.

At stage 316, the method 300 may end.

FIG. 4 depicts a flowchart illustrating key-value stores during key insertion, in accordance with an embodiment of the present disclosure. The process 400, however, is exemplary only. The process 400 can be altered, e.g., by having stages added, changed, removed, or rearranged. One or more of the stages may be implemented on host 10 and/or cache device 60 of FIG. 1. One or more stages may be implemented as modules as described in reference to FIG. 2 above. The method 400 may begin at stage 402.

At stage 404, a request to insert a key-value pair may be received. An uncompressed tree and one or more compressed trees may be traversed at stage 406 (e.g., using a list of trees containing root node offsets).

A determination of whether a key has been found may be made at stage 408. If a key has been found, the method 400 may continue at stage 410 where the existing key may be updated with a new received value. If a key has not been found, the method 400 may continue at stage 412 where a new key-value pair may be inserted in an uncompressed tree.
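
The insertion flow of stages 404 through 412 may be sketched, under stated assumptions, as follows. Dictionaries stand in for both the uncompressed tree and the compressed trees, the first list entry is assumed to be the uncompressed tree, and the function name `insert` and its return strings are hypothetical.

```python
def insert(trees, key, value):
    """Insert or update a pair; trees[0] is assumed to be the uncompressed tree."""
    for tree in trees:          # stage 406: traverse every tree in the list
        if key in tree:         # stage 408: has the key been found?
            tree[key] = value   # stage 410: update the existing key
            return "updated"
    trees[0][key] = value       # stage 412: insert in the uncompressed tree
    return "inserted"


trees = [{"x": 1}, {"a": 10}]
first = insert(trees, "a", 11)   # key already present in a compressed tree
second = insert(trees, "n", 5)   # key absent from all trees
```

Searching every tree before inserting keeps each key in at most one tree in this sketch, which is what allows a later lookup to stop at the first match.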

At stage 414, the method 400 may end.

FIG. 5 depicts a flowchart illustrating reclamation of key-value storage, in accordance with an embodiment of the present disclosure. The process 500, however, is exemplary. The process 500 can be altered, e.g., by having stages added, changed, removed, or rearranged. One or more of the stages may be implemented on host 10 and/or cache device 60 of FIG. 1. One or more stages may be implemented as modules as described in reference to FIG. 2 above. The method 500 may begin at stage 502.

A request for deletion of a key-value pair may be received at stage 504.

At stage 506, it may be determined whether a key corresponding to the delete request is in an uncompressed or a compressed tree. If a key corresponding to the delete request is in an uncompressed tree, the method 500 may delete the key-value pair at stage 508. If a key corresponding to the delete request is in a compressed tree, the method 500 may continue at stage 510.

A request for deletion of a key in a compressed tree may result in the corresponding key-value pair being marked as deleted (e.g., the value may be set to an indicator such as, for example, −1) at stage 510. Deletion of a key in a compressed tree may require reorganization of the tree so the key may be maintained but indicated as deleted. In response to the deletion request or in response to other factors, a percentage of deleted keys in a compressed tree may be compared against a specified threshold at stage 512. An amount, a percentage, or another measure of deleted keys in a compressed tree may be measured. According to some embodiments, the threshold level may be dynamically adjusted based on one or more factors. If a percentage of a compressed tree used by deleted keys meets or exceeds a specified threshold, reclamation of the space holding the deleted keys may be performed. Key-value pairs which have not been deleted but which are in a compressed tree may be copied to an uncompressed tree at stage 514.

After copying all keys which have not been deleted to an uncompressed tree, the compressed tree may then be deleted at stage 516. At stage 518 the method 500 may end.
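
Stages 504 through 516 may be illustrated by the following minimal sketch. It assumes that dictionaries stand in for the trees, that the first list entry is the uncompressed tree, that the deleted indicator is the value −1 mentioned above, and that the threshold is a fixed fraction of deleted keys. The names `DELETED`, `delete`, and `threshold` are hypothetical.

```python
DELETED = -1  # assumed marker; a real value equal to -1 would collide with it


def delete(trees, key, threshold=0.5):
    """Delete a key; trees[0] is assumed to be the uncompressed tree."""
    uncompressed = trees[0]
    if key in uncompressed:                       # stage 506: which kind of tree?
        del uncompressed[key]                     # stage 508: delete outright
        return
    for index in range(1, len(trees)):
        tree = trees[index]
        if key in tree and tree[key] != DELETED:
            tree[key] = DELETED                   # stage 510: mark as deleted
            deleted = sum(1 for v in tree.values() if v == DELETED)
            if deleted / len(tree) >= threshold:  # stage 512: compare to threshold
                for k, v in tree.items():         # stage 514: copy undeleted pairs
                    if v != DELETED:
                        uncompressed[k] = v
                del trees[index]                  # stage 516: delete compressed tree
            return


trees = [{}, {"a": 1, "b": 2, "c": 3, "d": 4}]
delete(trees, "a")                  # 1 of 4 keys deleted: below the threshold
after_first = dict(trees[1])
delete(trees, "b")                  # 2 of 4 keys deleted: reclamation is triggered
```

In this sketch the first deletion only marks the key, while the second deletion reaches the assumed 50% threshold, so the two surviving pairs are copied to the uncompressed tree and the compressed tree is removed from the list.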

Reclamation of unused space and deletion of keys may be initiated by one or more events. A delete request may trigger an evaluation of space allocated to deleted keys in a particular tree, an evaluation of available storage, an evaluation of an amount of tree and/or key-value data loaded into memory, an evaluation of available memory, an evaluation of available cache space, a comparison of a current processing and/or cache load versus an estimated reclamation time or load, and/or an evaluation of one or more performance metrics. Deletion and/or reclamation of space may also be performed at other times (e.g., periodic intervals such as nightly or during off-peak hours, or in response to an administrator command).

FIG. 6 shows an exemplary diagram of a plurality of key-value stores, in accordance with an embodiment of the present disclosure. List 602 may be a linked list, an ordered list, or another data structure for organizing a plurality of trees of key-value pairs. Trees may be referenced by an offset of a root node of a tree. The offset may indicate a file, cache, or other storage location for a root node of a tree. As depicted in exemplary diagram 600, list 602 may reference one or more tree root nodes such as, for example, uncompressed tree root node 604, compressed tree root node 606A, and compressed tree root node 606B. Each root node of a tree may reference one or more secondary branch or leaf nodes. Nodes of uncompressed tree root node 604 including nodes 608A(1) . . . 608A(N) may be larger and may contain larger values than nodes of compressed tree root node 606A and compressed tree root node 606B. The number of nodes is exemplary and may change due to insertions, deletions, reclamation of space used by deleted nodes and other factors. List 602 may reference a plurality of trees of key-value pairs and trees of key-value pairs may be located in separate storage. For example, list 602 may be maintained in DRAM on host 10 of FIG. 1. The uncompressed tree beginning at root 604 may be on a first SSD associated with host 10. The compressed tree beginning at node 606A may be on a second SSD associated with host 10. The compressed tree beginning with node 606B may be on a disk associated with host 10, a third SSD associated with host 10, or other electronic storage associated with host 10.

According to some embodiments, a number of compressed nodes 610A(1) . . . 610A(N) and 610B(1) . . . 610B(N) may be fewer in number than a number of nodes in an uncompressed tree. However, a larger number of compressed trees may be utilized. A number of nodes and trees of each type may depend on many factors including, but not limited to, available cache space, available DRAM, a number of key-value pairs, a user specified preference, key retrieval performance metrics, and key insertion performance metrics.

As described above, uncompressed nodes may be converted to compressed nodes (e.g., in a new compressed tree of key-value pairs) in response to one or more metrics. Metrics initiating a conversion of uncompressed nodes to compressed nodes may include a memory usage threshold, a storage space usage threshold, a number of keys in a tree, and one or more performance metrics.

Additionally, as discussed above, in one or more embodiments, compressed nodes may be converted to uncompressed nodes. For example, if a number of keys marked as deleted in compressed tree 606A or 606B meets a specified threshold, then the keys which have not been deleted may be migrated to an uncompressed tree. Once undeleted keys for a compressed tree have been migrated to an uncompressed tree, the compressed tree may be deleted.

Other embodiments are within the scope and spirit of the invention. For example, the functionality described above can be implemented using software, hardware, firmware, hardwiring, or combinations of any of these. One or more computer processors operating in accordance with instructions may implement the functions associated with managing key-value stores in accordance with the present disclosure as described above. If such is the case, it is within the scope of the present disclosure that such instructions may be stored on one or more non-transitory processor readable storage media (e.g., a magnetic disk or other storage medium). Additionally, modules implementing functions may also be physically located at various positions, including being distributed such that portions of functions are implemented at different physical locations.

The present disclosure is not to be limited in scope by the specific embodiments described herein. Indeed, other various embodiments of and modifications to the present disclosure, in addition to those described herein, will be apparent to those of ordinary skill in the art from the foregoing description and accompanying drawings. Thus, such other embodiments and modifications are intended to fall within the scope of the present disclosure. Further, although the present disclosure has been described herein in the context of a particular implementation in a particular environment for a particular purpose, those of ordinary skill in the art will recognize that its usefulness is not limited thereto and that the present disclosure may be beneficially implemented in any number of environments for any number of purposes. Accordingly, the claims set forth below should be construed in view of the full breadth and spirit of the present disclosure as described herein.

The invention claimed is:
 1. A method for improving key-value stores in electronic storage comprising: creating an uncompressed tree of key-value pairs; monitoring, using a computer processor, growth of the uncompressed tree; compressing the uncompressed tree when the uncompressed tree meets a specified threshold; and creating a new empty uncompressed tree.
 2. The method of claim 1, wherein the threshold comprises at least one of: a percentage of used space in an uncompressed tree, a number of keys in an uncompressed tree, an amount of storage used by an uncompressed tree, a percentage of storage used by an uncompressed tree, a percentage of memory used by an uncompressed tree, an amount of memory used by an uncompressed tree, a percentage of cache used by an uncompressed tree, an amount of cache used by an uncompressed tree, a key retrieval performance metric, a key insertion performance metric, a key deletion performance metric, a range query performance metric, a current load on a processor, a current I/O load on a cache, a current disk I/O load, a current network I/O load, and a user specified metric.
 3. The method of claim 1, further comprising: maintaining a list of an uncompressed tree and one or more compressed trees; and searching one or more trees in the list to identify a key.
 4. The method of claim 1, further comprising: receiving a request to insert new key-value data; searching one or more key-value trees; determining whether a key corresponding to the new key-value data exists in a tree; in the event that the key is identified in a tree, updating data associated with the key with the new key-value data; and in the event that the key is not identified in a tree, inserting a new key-value pair in the uncompressed tree.
 5. The method of claim 1, further comprising: receiving a request to delete a key-value pair from a tree; determining whether the requested key-value pair is in uncompressed tree or a compressed tree; in the event the requested key-value pair is in an uncompressed tree, deleting the requested key-value pair; and in the event the requested key-value pair is in a compressed tree, marking the requested key-value pair for deletion.
 6. The method of claim 5, further comprising: determining that an amount of deleted key-value pairs in a tree having at least one deleted key-value pair meets a specified threshold; and reclaiming space used by the deleted key-value pairs.
 7. The method of claim 6, wherein reclaiming space used by the deleted key-value pairs comprises: copying a key-value pair which has not been deleted from the tree having at least one deleted key-value pair to the uncompressed tree; and deleting the tree having at least one deleted key-value pair.
 8. The method of claim 6, wherein the threshold comprises at least one of: a percentage of unused space in a compressed tree, a number of deleted keys in a compressed tree, an amount of storage used by a compressed tree, a percentage of storage used by a compressed tree, a percentage of memory used by a compressed tree, an amount of memory used by a compressed tree, a percentage of cache used by a compressed tree, an amount of cache used by a compressed tree, a key retrieval performance metric, a key deletion performance metric, a range query performance metric, a current load on a processor, a current network I/O load, a current cache I/O load, a current disk I/O load, and a user specified metric.
 9. The method of claim 6, wherein reclaiming space used by the deleted key-value pairs is scheduled based on at least one of: an off-peak time, a comparison of a current processing load versus an estimated reclamation load, a comparison of a current I/O load versus an estimated reclamation load, and a user command.
 10. The method of claim 1, further comprising: receiving first key indicating an upper range boundary of a range query and a second key indicating a lower range boundary of the range query; and traversing one or more trees to return key-values within the boundaries of the range query.
 11. The method of claim 1, further comprising: maintaining a data structure containing a reference to the uncompressed tree and a reference to at least one compressed tree.
 12. The method of claim 11, wherein the reference comprises a linked list.
 13. The method of claim 11, wherein the data structure contains references to trees which are ordered based on a specified metric.
 14. The method of claim 13, wherein the specified metric comprises at least one of: trees having most frequently requested key-value pairs, trees having least frequently requested key-value pairs, trees having most frequently updated key-value pairs, trees having a highest percentage of deleted keys, trees having a lowest percentage of deleted keys, or a user specified metric.
 15. The method of claim 11, wherein the data structure is maintained in memory of a host device and references trees maintained on a cache device associated with the host device.
 16. The method of claim 11, wherein the reference comprises an offset indicating a location of the tree in storage.
 17. An article of manufacture for improving key-value stores in electronic storage, the article of manufacture comprising: at least one non-transitory processor readable storage medium; and instructions stored on the at least one medium; wherein the instructions are configured to be readable from the at least one medium by at least one processor and thereby cause the at least one processor to operate so as to: create an uncompressed tree of key-value pairs; monitor growth of the uncompressed tree; compress the uncompressed tree when the uncompressed tree meets a specified threshold; and create a new empty uncompressed tree.
 18. The article of manufacture of claim 17, wherein the instructions are further configured to execute and thereby cause the at least one processor to operate so as to: maintain a list of an uncompressed tree and one or more compressed trees; and search one or more trees in the list to identify a key.
 19. The article of manufacture of claim 17, wherein the instructions are further configured to execute and thereby cause the at least one processor to operate so as to: receive a request to insert new key-value data; search one or more key-value trees; determine whether a key corresponding to the new key-value data exists in a tree; in the event that the key is identified in a tree, update data associated with the key with the new key-value data; and in the event that the key is not identified in a tree, insert a new key-value pair in the uncompressed tree.
 20. The article of manufacture of claim 17, wherein the instructions are further configured to execute and thereby cause the at least one processor to operate so as to: receive a request to delete a key-value pair from a tree; determine whether the requested key-value pair is in uncompressed tree or a compressed tree; in the event the requested key-value pair is in an uncompressed tree, delete the requested key-value pair; and in the event the requested key-value pair is in a compressed tree, mark the requested key-value pair for deletion.